A clustered file system is a file system that is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Without one, however, the complexity of the underlying storage environment tends to grow as servers are added.
A shared disk file system uses a storage area network (SAN) or RAID to provide direct disk access from multiple computers at the block level. Translation from the file-level operations that applications use to the block-level operations used by the SAN must take place on the client node. The most common type of clustered file system, the shared disk file system adds a mechanism for concurrency control that gives a consistent and serializable view of the file system, avoiding corruption and unintended data loss even when multiple clients access the same files at the same time. Shared disk file systems also usually employ some sort of fencing mechanism to prevent data corruption in case of node failures.
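One concurrency-control and fencing scheme can be sketched with fencing tokens: a lock manager hands out monotonically increasing tokens with each exclusive lease, and the shared storage refuses writes that carry a token older than the newest one it has seen, so a node whose lease has been superseded cannot corrupt data. The Python sketch below is purely illustrative; the class names and the in-memory "block device" are hypothetical and do not correspond to any particular clustered file system, which might instead rely on SCSI reservations or power fencing.

    import threading

    class LockManager:
        """Issues exclusive leases with monotonically increasing fencing tokens."""
        def __init__(self):
            self._mutex = threading.Lock()
            self._token = 0

        def acquire(self):
            with self._mutex:
                self._token += 1
                return self._token  # higher token = more recent lease holder

    class SharedBlockDevice:
        """Rejects writes carrying a token older than the newest one seen."""
        def __init__(self):
            self._mutex = threading.Lock()
            self._highest_token = 0
            self.blocks = {}

        def write(self, token, block_no, data):
            with self._mutex:
                if token < self._highest_token:
                    raise PermissionError("stale fencing token: node has been fenced off")
                self._highest_token = token
                self.blocks[block_no] = data

    # Node A acquires a lease, stalls, and node B takes over; A's delayed
    # write is then rejected because its fencing token is stale.
    mgr, disk = LockManager(), SharedBlockDevice()
    token_a = mgr.acquire()
    token_b = mgr.acquire()
    disk.write(token_b, 0, b"write from B")
    try:
        disk.write(token_a, 0, b"late write from A")
    except PermissionError as err:
        print(err)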
The underlying storage area network might use any of a number of block-level protocols, including SCSI, iSCSI, HyperSCSI, ATA over Ethernet (AoE), Fibre Channel, and InfiniBand.
There are different architectural approaches to a shared disk file system. Some distribute file information across all the servers in a cluster (fully distributed). Others utilize a centralized metadata server. Both achieve the same result of enabling all servers to access all the data on a shared storage device.
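As a rough illustration of the centralized-metadata approach, the following Python sketch (hypothetical names, greatly simplified) has every client ask one metadata server where a file's blocks live, after which the client performs the block I/O against the shared device itself:

    class MetadataServer:
        """Maps file paths to block numbers on the shared device (centralized layout)."""
        def __init__(self):
            self._layout = {}       # path -> list of block numbers
            self._next_block = 0

        def allocate(self, path, n_blocks):
            blocks = list(range(self._next_block, self._next_block + n_blocks))
            self._next_block += n_blocks
            self._layout[path] = blocks
            return blocks

        def lookup(self, path):
            return self._layout[path]

    class Client:
        """Asks the metadata server where a file lives, then reads the blocks directly."""
        def __init__(self, mds, shared_device):
            self.mds = mds
            self.device = shared_device     # a dict standing in for SAN blocks

        def read(self, path):
            return b"".join(self.device[b] for b in self.mds.lookup(path))

    # Every client consults the same metadata server but performs its own block I/O.
    device = {}
    mds = MetadataServer()
    for i, block in enumerate(mds.allocate("/data/run1", 2)):
        device[block] = ("chunk%d" % i).encode()
    print(Client(mds, device).read("/data/run1"))   # b'chunk0chunk1'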
Scientists working on the ALICE experiment will be using a 4 Gbit/s Fibre Channel SAN with a clustered file system to store the massive amount of data generated by the experiment (estimated at 1 GB/second for a month). Reasons quoted for this choice include "performance, scalability and vendor independence".[1]
Distributed file systems do not share block-level access to the same storage; instead, clients access files over a network protocol.
Network-attached storage (NAS) provides both storage and a file system, much as a shared disk file system layered on a SAN does. NAS typically uses file-based protocols (as opposed to block-based protocols) such as NFS (popular on UNIX systems), SMB/CIFS (Server Message Block/Common Internet File System, used with MS Windows systems), or AFP (used with Apple Macintosh computers).
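To an application, a file on a NAS mount looks like any local file; the operating system's NFS or SMB client translates ordinary open, read, and write calls into protocol requests over the network. A minimal Python sketch, assuming a share has already been mounted at the hypothetical path /mnt/nas:

    # "/mnt/nas" is a hypothetical mount point for an NFS or SMB share.
    # The application uses ordinary file operations; the OS client turns
    # them into file-based protocol requests behind the scenes.
    share_path = "/mnt/nas/report.txt"

    with open(share_path, "w") as f:
        f.write("written over a file-based protocol, transparently\n")

    with open(share_path) as f:
        print(f.read())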
The failure of disk hardware can create a single point of failure that can result in data loss. To avoid this problem, a shared nothing architecture can be employed. Each storage node communicates changes to other nodes or to a master, for replication purposes. If a single disk fails, other copies can be used to reconstruct or replace it on the fly so "nothing" is lost. To enable this feature, clients must be unaware of the physical location of a file. A single global file system is presented to clients, so the file system itself deals with allocations and low-level failures. Examples of this type of file system are found in products such as Ceph, Lustre, Isilon, IBRIX Fusion, and Hadoop.
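As a way to picture this, the hypothetical Python sketch below writes each chunk to several independent nodes and, when one node fails, serves reads from a surviving replica. It is a heavily simplified model (fixed replication factor, no re-replication or consistency protocol) and not the design of Ceph, Lustre, Isilon, IBRIX Fusion, or Hadoop:

    import random

    class StorageNode:
        def __init__(self, name):
            self.name, self.alive, self.chunks = name, True, {}

    class GlobalFileSystem:
        """Presents one namespace and replicates each chunk to several nodes,
        so the failure of a single disk or node loses nothing (shared nothing)."""
        def __init__(self, nodes, replicas=2):
            self.nodes, self.replicas = nodes, replicas
            self.placement = {}     # (path, chunk_no) -> nodes holding a copy

        def write(self, path, chunk_no, data):
            targets = random.sample(self.nodes, self.replicas)
            for node in targets:
                node.chunks[(path, chunk_no)] = data
            self.placement[(path, chunk_no)] = targets

        def read(self, path, chunk_no):
            for node in self.placement[(path, chunk_no)]:
                if node.alive:                      # skip failed replicas
                    return node.chunks[(path, chunk_no)]
            raise IOError("all replicas lost")

    nodes = [StorageNode("n%d" % i) for i in range(3)]
    fs = GlobalFileSystem(nodes)
    fs.write("/logs/a", 0, b"payload")
    nodes[0].alive = False                          # simulate a node failure
    print(fs.read("/logs/a", 0))                    # still readable from a replica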
IBM mainframes in the 1970s could share physical disks and file systems if each machine had its own channel connection to the drives' control units. In the 1980s, Digital Equipment Corporation's TOPS-20 and VAX/VMS clusters included shared disk file systems.